Dimensionality Reduction of Protein Mass Spectrometry Data Using Random Projection

نویسندگان

  • Chen Change Loy
  • Weng-Kin Lai
  • Chee Peng Lim
چکیده

Protein mass spectrometry (MS) pattern recognition has recently emerged as a new method for cancer diagnosis. Unfortunately, classification performance may degrade owing to the enormously high dimensionality of the data. This paper investigates the use of Random Projection in protein MS data dimensionality reduction. The effectiveness of Random Projection (RP) is analyzed and compared against Principal Component Analysis (PCA) by using three classification algorithms, namely Support Vector Machine, Feed-forward Neural Networks and K-Nearest Neighbour. Three real-world cancer data sets are employed to evaluate the performances of RP and PCA. Through the investigations, RP method demonstrated better or at least comparable classification performance as PCA if the dimensionality of the projection matrix is sufficiently large. This paper also explores the use of RP as a pre-processing step prior to PCA. The results show that without sacrificing classification accuracy, performing RP prior to PCA significantly improves the computational time.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Random projection for dimensionality reduction--applied to time-of-flight secondary ion mass spectrometry data.

Random projection (RP) is a simple and fast linear method for dimensionality reduction of high-dimensional multivariate data, independent from the data. The method is briefly described and a new memory-saving algorithm is presented for the generation of random projection vectors. Application of RP to data from scanning experiments with a time-of-flight secondary ion mass spectrometer (TOF-SIMS)...

متن کامل

انجام یک مرحله پیش پردازش قبل از مرحله استخراج ویژگی در طبقه بندی داده های تصاویر ابر طیفی

Hyperspectral data potentially contain more information than multispectral data because of their higher spectral resolution. However, the stochastic data analysis approaches that have been successfully applied to multispectral data are not as effective for hyperspectral data as well. Various investigations indicate that the key problem that causes poor performance in the stochastic approaches t...

متن کامل

مدل ترکیبی تحلیل مؤلفه اصلی احتمالاتی بانظارت در چارچوب کاهش بعد بدون اتلاف برای شناسایی چهره

In this paper, we first proposed the supervised version of probabilistic principal component analysis mixture model. Then, we consider a learning predictive model with projection penalties, as an approach for dimensionality reduction without loss of information for face recognition. In the proposed method, first a local linear underlying manifold of data samples is obtained using the supervised...

متن کامل

2D Dimensionality Reduction Methods without Loss

In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques has been applied in a lossless dimensionality reduction framework, for face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...

متن کامل

The Use of Random Projections for the Analysis of Mass Spectrometry Imaging Data

The 'curse of dimensionality' imposes fundamental limits on the analysis of the large, information rich datasets that are produced by mass spectrometry imaging. Additionally, such datasets are often too large to be analyzed as a whole and so dimensionality reduction is required before further analysis can be performed. We investigate the use of simple random projections for the dimensionality r...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006